PartitionTuner: An operator scheduler for deep‐learning compilers supporting multiple heterogeneous processing units

Authors

Abstract

Recently, embedded systems, such as mobile platforms, have multiple processing units (PUs) that can operate in parallel, such as central processing units (CPUs) and neural processing units (NPUs). We can use deep-learning compilers to generate machine code optimized for these embedded systems from a deep neural network (DNN). However, the deep-learning compilers proposed so far generate codes that sequentially execute DNN operators on a single processing unit, or parallel codes for graphic processing units (GPUs). In this study, we propose PartitionTuner, an operator scheduler for deep-learning compilers that supports multiple heterogeneous PUs, including CPUs and NPUs. PartitionTuner generates an operator-scheduling plan that uses all available PUs simultaneously to minimize overall DNN inference time. Operator scheduling is based on the analysis of the DNN architecture and the performance profiles of individual and group operators measured on heterogeneous processing units. In experiments on seven DNNs, PartitionTuner generates scheduling plans that perform 5.03% better than a static type-based operator-scheduling technique for SqueezeNet. In addition, PartitionTuner outperforms recent profiling-based operator-scheduling techniques for ResNet50, ResNet18, and SqueezeNet by 7.18%, 5.36%, and 2.73%, respectively.
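The idea of profile-driven operator scheduling described in the abstract can be illustrated with a small sketch. This is not PartitionTuner's algorithm, which the abstract does not specify; it is a generic greedy earliest-finish-time list scheduler. The operator names, the dependency graph, and all profiled times below are hypothetical values chosen only for illustration.

```python
from collections import deque

def schedule(ops, deps, profile, pus):
    """Greedy earliest-finish-time scheduling of an operator DAG.

    ops:     operator names, listed in a valid topological order
    deps:    dict mapping an operator to its predecessor operators
    profile: dict mapping (operator, pu) to a measured time in ms;
             a missing key means the PU cannot run that operator
    pus:     names of the available processing units
    """
    indeg = {op: len(deps.get(op, [])) for op in ops}
    succs = {op: [] for op in ops}
    for op, preds in deps.items():
        for p in preds:
            succs[p].append(op)

    ready = deque(op for op in ops if indeg[op] == 0)
    pu_free = {pu: 0.0 for pu in pus}   # time at which each PU becomes idle
    finish = {}                         # finish time of each scheduled operator
    plan = {}                           # operator -> (pu, start, end)

    while ready:
        op = ready.popleft()
        # An operator can start only after all of its inputs are produced.
        data_ready = max((finish[p] for p in deps.get(op, [])), default=0.0)
        best = None
        for pu in pus:
            if (op, pu) not in profile:
                continue
            start = max(pu_free[pu], data_ready)
            end = start + profile[(op, pu)]
            if best is None or end < best[2]:
                best = (pu, start, end)
        if best is None:
            raise ValueError(f"no processing unit can run {op!r}")
        plan[op] = best
        pu_free[best[0]] = best[2]
        finish[op] = best[2]
        for s in succs[op]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)

    return plan, max(finish.values())

# Hypothetical four-operator graph with two parallel branches.
ops = ["conv1", "branch_a", "branch_b", "concat"]
deps = {"branch_a": ["conv1"], "branch_b": ["conv1"],
        "concat": ["branch_a", "branch_b"]}
profile = {  # made-up per-PU times in ms; concat is assumed CPU-only
    ("conv1", "cpu"): 4.0, ("conv1", "npu"): 1.0,
    ("branch_a", "cpu"): 3.0, ("branch_a", "npu"): 1.0,
    ("branch_b", "cpu"): 2.0, ("branch_b", "npu"): 1.5,
    ("concat", "cpu"): 0.5,
}
plan, makespan = schedule(ops, deps, profile, ["cpu", "npu"])
for op, (pu, start, end) in plan.items():
    print(f"{op}: {pu} [{start}, {end}]")
print("makespan:", makespan)
```

With these made-up numbers, the sketch places one branch on the CPU while the NPU runs the other, so the two branches overlap instead of running back to back on a single unit.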

Similar resources

Using compilers for heterogeneous system design

Heterogeneous systems combine both data and control processing functions. A programmable DSP core forms the central component. The design of such systems establishes a new application of compilers in electronic CAD: In order to meet given real-time constraints and optimize chip area consumption, the DSP core needs to be customized for each application. In turn, this requires compiler support fo...

The Case for Multiple Compilers

For virtual machine implementations to achieve high performance, some form of translation of the virtual machine's input language into the native code of the host machine seems necessary. This translation process is often called just-in-time (a.k.a. JIT) compilation, or sometimes dynamic compilation. The use of JIT compilation introduces a tension in virtual machine design: compilation time adds...

SHRED: a CPU Scheduler for Heterogeneous Applications

General purpose workstations must support a wide variety of application characteristics; but it is hard to find a single CPU scheduling scheme that satisfactorily schedules processes from all types of applications. It is particularly difficult to get periodic deadline-driven continuous media processes to satisfactorily coexist with others. A number of schemes have been proposed to address this ...

Data-Replicas Scheduler for Heterogeneous MapReduce Cluster

Large-scale data processing has increased rapidly in recent years. The MapReduce programming model, which was first introduced in functional languages, appeared in distributed systems and has performed excellently in large-scale data processing since 2006. Hadoop, which is the most popular framework among open-source MapReduce runtime environments, supplies reliable, scalable and distributed system processing la...

Issues in Supporting Interoperable Query Processing with Multiple Heterogeneous Information Servers

This project has three goals. First, we support interoperable query processing among multiple heterogeneous sources. Heterogeneity arises due to the existence of different data models, schema and query languages. For an initial query posed w.r.t. a schema and query language, we produce mediated (transformed) queries w.r.t. other schema/query languages. The second goal is to modify traditional qu...

Journal

Journal title: ETRI Journal

Year: 2023

ISSN: 1225-6463, 2233-7326

DOI: https://doi.org/10.4218/etrij.2021-0446